Robust multipitch estimation for the analysis and manipulation of polyphonic musical signals
A method for estimating the multiple pitches of concurrent musical sounds is described. Experimental data comprised sung vowels and the whole pitch range of 26 musical instruments. Multipitch estimation was performed at the level of a single time frame for random pitch and sound source combinations. Note error rates for mixtures ranging from one to six simultaneous sounds were 2.1 %, 2.4 %, 3.8 %, 8.1 %, 12 %, and 18 %, respectively. In musical interval and chord identification tasks, the algorithm outperformed the average of ten trained musicians. Particular emphasis was laid on robustness in the presence of other sounds and noise. The algorithm is based on an iterative estimation and separation procedure and is able to resolve at least a couple of the most prominent pitches even in ten-sound polyphonies. Sounds that exhibit inharmonicities can be handled without problems, and the inharmonicity factor and spectral envelope of each sound are estimated along with the pitch. Examples are given of musical signal manipulations that become possible with the proposed method.
Model-based event labeling in the transcription of percussive audio signals
In this paper we describe a method for the transcription of percussive audio signals that have been performed with arbitrary non-drum sounds. The system locates sound events in the input signal using an onset detector. Then a set of features is extracted at the onset times. Feature vectors are clustered and the clusters are assigned labels which describe the rhythmic role of each event. For the labeling, a novel method is proposed which is based on the metrical (temporal) positions of the sound events within the measures. The system is evaluated using monophonic percussive tracks consisting of non-drum sounds. In simulations, the system achieved a total error rate of 33.7%. Demo signals are available at URL:<http://www.cs.tut.fi/~paulus/demo/>.
Acoustic features for music piece structure analysis
Automatic analysis of the structure of a music piece aims to recover its sectional form: segmentation into musical parts, such as chorus or verse, and detection of repeated occurrences. A music signal is here described with features that are assumed to deliver information about its structure: mel-frequency cepstral coefficients, chroma, and rhythmogram. The features can be focused on different time scales of the signal. Two distance measures are presented for comparing musical sections: “stripes” for detecting repeated feature sequences, and “blocks” for detecting homogeneous sections. The features and their time scales are evaluated in a system-independent manner. Based on the obtained information, the features and distance measures are evaluated in an automatic structure analysis system with a large music database with manually annotated structures. The evaluations show that in a realistic situation, feature combinations perform better than individual features.
Application of non-negative matrix factorization to signal-adaptive audio effects
This paper proposes novel audio effects based on manipulating an audio signal in a representation domain provided by non-negative matrix factorization (NMF). Critical-band magnitude spectrograms Y of sounds are first factorized into a product of two lower-rank matrices so that Y ≈ BG. The parameter matrices B and G are then processed in order to achieve the desired effect. Three classes of effects were investigated: 1) dynamic range compression (or expansion) of the component spectra or gains, 2) effects based on rank-ordering the components (columns of B and the corresponding rows of G) according to acoustic features extracted from them, and then weighting each component according to its rank, and 3) distortion effects based on controlling the number of components (and thus the reconstruction error) in the above linear approximation. The subjective quality of the effects was assessed in a listening test.
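The factorization Y ≈ BG and the first class of effects can be illustrated with a minimal sketch. It assumes the standard multiplicative updates for a Euclidean cost and a simple power-law compression of the gains; the abstract does not specify the cost function or the compression curve, so both are illustrative choices, and the names `nmf` and `compress_gains` are hypothetical.

```python
import numpy as np

def nmf(Y, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix Y (bands x frames) as Y ~ B @ G.

    Assumption: multiplicative updates for the Euclidean cost; the paper's
    actual cost function is not stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = Y.shape
    B = rng.random((n_bands, rank)) + eps   # component spectra (columns)
    G = rng.random((rank, n_frames)) + eps  # time-varying gains (rows)
    for _ in range(n_iter):
        G *= (B.T @ Y) / (B.T @ B @ G + eps)
        B *= (Y @ G.T) / (B @ G @ G.T + eps)
    return B, G

def compress_gains(G, exponent=0.5):
    """Hypothetical power-law curve: exponent < 1 compresses, > 1 expands."""
    return G ** exponent

# Toy stand-in for a critical-band magnitude spectrogram.
rng = np.random.default_rng(1)
Y = rng.random((24, 100))
B, G = nmf(Y, rank=5)
Y_effect = B @ compress_gains(G)  # processed magnitude spectrogram
```

Lowering `rank` in this sketch increases the reconstruction error of `B @ G`, which is the quantity the third class of effects controls.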
Pitch Shifting of Audio Signals Using the Constant-Q Transform
Pitch-scale modifications of polyphonic music are usually performed by manipulating a time-frequency representation of the input signal. Most approaches proposed in the past are based on the Fourier transform, although its linear frequency bin spacing is known to be inadequate to some degree for analysing and processing music signals. Recently, invertible constant-Q transforms (CQT) featuring high Q-factors have been proposed, exhibiting a more suitable geometrical bin spacing. In this paper a frequency-domain pitch-shifting approach based on the CQT is proposed. The CQT is specifically attractive for pitch shifting because the shift can be implemented by frequency translation (shifting partials along the frequency axis) as opposed to spectral stretching in the Fourier transform domain. Furthermore, the high time resolution of the CQT at high frequencies improves transient preservation. Audio examples are provided to illustrate the results achieved with the proposed method.
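The frequency-translation idea can be sketched as follows. With geometrically spaced bins, the centre frequency of bin k is f_min · 2^(k/b) for b bins per octave, so moving every coefficient up by m bins scales all frequencies by the same factor 2^(m/b). The function name `shift_cqt_bins` is hypothetical, and the sketch only moves coefficients; the phase correction and the inverse transform that a complete pitch shifter would need are not shown.

```python
import numpy as np

def shift_cqt_bins(C, semitones, bins_per_octave=12):
    """Translate CQT coefficients C (bins x frames) along the frequency axis.

    Bins shifted beyond the analysed range are discarded and vacated bins
    are zero-filled. Phase handling is deliberately omitted in this sketch.
    """
    shift = int(round(semitones * bins_per_octave / 12))
    out = np.zeros_like(C)
    if shift > 0:
        out[shift:] = C[:-shift]    # move content upwards
    elif shift < 0:
        out[:shift] = C[-shift:]    # move content downwards
    else:
        out[:] = C
    return out

# Toy coefficient matrix: 6 bins, 2 frames.
C = np.arange(12, dtype=complex).reshape(6, 2)
up = shift_cqt_bins(C, semitones=2)     # two bins up at 12 bins per octave
down = shift_cqt_bins(C, semitones=-1)  # one bin down
```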
The Wablet: Scanned Synthesis on a Multi-Touch Interface
This paper presents research into scanned synthesis on a multi-touch screen device. This synthesis technique involves scanning a wavetable that is dynamically evolving in the manner of a mass-spring network. It is argued that scanned synthesis can provide a good solution to some of the issues in digital musical instrument design, and is particularly well suited to multi-touch screens. In this implementation, vibrating mass-spring networks with a variety of configurations can be created. These can be manipulated by touching, dragging and altering the orientation of the tablet. Arbitrary scanning paths can be drawn onto the structure. Several extensions to the original scanned synthesis technique are proposed, the most important of which for multi-touch implementations is the freedom of the masses to move in two dimensions. An analysis of the scanned output in the case of a 1D ideal string model is given, and scanned synthesis is also discussed as a generalisation of a number of other synthesis methods.
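A minimal sketch of the basic technique, assuming a 1D string of masses with fixed ends: the mass positions form a wavetable that evolves slowly under spring forces while it is read out at audio rate along a fixed linear scanning path. All names and parameter values here (`step_string`, `scanned_synthesis`, the stiffness, damping and update interval) are illustrative assumptions, not the Wablet implementation, which also allows two-dimensional motion and arbitrary scanning paths.

```python
import numpy as np

def step_string(pos, vel, stiffness=0.1, damping=0.01):
    """Advance a 1D mass-spring string with fixed ends by one update."""
    force = np.zeros_like(pos)
    # Each interior mass is pulled towards the mean of its two neighbours.
    force[1:-1] = stiffness * (pos[:-2] - 2.0 * pos[1:-1] + pos[2:])
    vel = (1.0 - damping) * (vel + force)
    vel[0] = vel[-1] = 0.0  # end masses stay fixed
    return pos + vel, vel

def scanned_synthesis(n_masses=64, n_samples=4410, f0=220.0, sr=44100,
                      update_every=64):
    """Scan the evolving mass positions as a wavetable at pitch f0."""
    pos = np.hanning(n_masses)  # initial shape, zero at both ends
    vel = np.zeros(n_masses)
    out = np.empty(n_samples)
    phase = 0.0                 # position along the scan path, in [0, 1)
    for n in range(n_samples):
        if n % update_every == 0:  # the string evolves below audio rate
            pos, vel = step_string(pos, vel)
        idx = phase * n_masses
        i0 = int(idx) % n_masses
        i1 = (i0 + 1) % n_masses
        frac = idx - int(idx)
        out[n] = (1.0 - frac) * pos[i0] + frac * pos[i1]  # linear interpolation
        phase = (phase + f0 / sr) % 1.0
    return out

signal = scanned_synthesis()
```

In this sketch the scan rate `f0` sets the pitch, while the string dynamics shape how the timbre changes over time.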